VIII. CLAIMS APPENDIX

The following claims are presented for review, as reflected upon entry of the
Amendment Under 37 C.F.R. §1.111 filed on 9/13/04. It is noted by Appellants that the
listing of claims in the Amendment Under 37 C.F.R. §1.111 filed on 4/21/05 has
typographical errors in the dependencies listed for claims 2, 3, 6, 7, and 10, as well as
mistakes in the status of these claims, as noted in the Order Returning Undocketed Appeal to
Examiner, mailed to Appellants on June 13, 2006. These errors are explained as
typographical errors made by the preparer of that Amendment filed on 4/21/05 when a set of
claim amendments was initially prepared for the Amendment to be filed and then it was
decided to revert to the claims as previously presented in the 9/13/04 Amendment. The
changes to claims 13 and 15 that were made in the 4/21/05 Amendment are not considered
particularly significant to the issues of this Appeal.



1. (Previously presented) A method of converting a document corpus containing an ordered 
plurality of documents into a compact representation in memory of occurrence data, said 
method comprising: 

developing a first vector for said entire document corpus, said first vector being a 
listing of integers corresponding to terms in said documents such that each said document in 
said document corpus is sequentially represented in said listing. 



2. (Previously presented) The method of claim 19, further comprising:

developing a third vector for said entire document corpus, said third vector
comprising a sequential listing of floating point multipliers, each said floating point
multiplier representing a document normalization factor.

3. (Original claim) The method of claim 1, further comprising: 

rearranging, in said first vector, an order of said unique integers within the data for
each said document so that all identical unique integers are adjacent.

4. (Original claim) The method of claim 2, wherein said normalization factor is calculated
as:

NF = 1/(Σ Xi^2)^(1/2), where Xi is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all
term occurrences in said document.
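
By way of illustration only, and without limiting the claims, the normalization factor recited above might be computed as in the following Python sketch. The function name `normalization_factor`, the sample counts, and the handling of an empty document are assumptions made for this example and do not appear in the claims.

```python
import math


def normalization_factor(term_counts):
    """Return 1 / sqrt(sum of squared occurrence counts) for one document."""
    sum_of_squares = sum(x * x for x in term_counts)
    if sum_of_squares == 0:
        # Assumption: a document with no term occurrences gets a factor of 0.0
        # instead of raising a division-by-zero error.
        return 0.0
    return 1.0 / math.sqrt(sum_of_squares)


# Hypothetical document whose terms occur 3 and 4 times:
# the sum of squares is 9 + 16 = 25, so NF = 1 / 5.
nf = normalization_factor([3, 4])
```

Multiplying each occurrence count by this factor scales the document's count vector to unit length, which is why a single floating point multiplier per document suffices.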

5. (Previously presented) A method of converting, organizing, and representing in a 
computer memory a document corpus containing an ordered plurality of documents, said 
method comprising: 

for said document corpus, taking in sequence each said ordered document and 
developing a first uninterrupted listing of integers to correspond to an occurrence of terms in
the document corpus. 

6. (Previously presented) The method of claim 21, further comprising: 

developing a third uninterrupted listing for said entire document corpus, said third
uninterrupted listing containing a sequential listing of floating point multipliers, each said 
floating point multiplier representing a document normalization factor for a corresponding 
document in said document corpus. 

7. (Original claim) The method of claim 5, further comprising:

for each said document in said document corpus, rearranging said unique integers so
that any identical integers are adjacent.

8. (Original claim) The method of claim 6, wherein said normalization factor is calculated
as:

NF = 1/(Σ Xi^2)^(1/2), where Xi is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all
term occurrences in said document.

9. (Previously presented) An apparatus for organizing and representing in a computer 
memory a document corpus containing an ordered plurality of documents, said apparatus
comprising:

an integer determining module receiving in sequence each said ordered document of
said document corpus and developing a first uninterrupted listing of said unique integers to
correspond to an occurrence of terms in the document corpus. 

10. (Previously presented) The apparatus of claim 23, further comprising: 

a normalizer developing a third uninterrupted listing for said entire document corpus,
containing a sequential listing of floating point multipliers, each said floating point multiplier
representing a document normalization factor for a corresponding document in said 
document corpus. 

11. (Original claim) The apparatus of claim 9, further comprising:

a rearranger rearranging said unique integers so that any identical integers for each
said document in said document corpus are adjacent.

12. (Original claim) The apparatus of claim 10, wherein said normalizer calculates said
normalization factor as:

NF = 1/(Σ Xi^2)^(1/2), where Xi is the number of occurrences of a specific term in said
document, so that NF represents the reciprocal of the square root of the sum of squares of all 
term occurrences in said document. 

13. (Previously presented) A signal-bearing medium tangibly embodying a program of 
machine-readable instructions executable by a digital processing apparatus to perform a
method to organize and represent in a computer memory a document corpus containing an
ordered plurality of documents, said method comprising:

developing a first uninterrupted listing of said unique integers to correspond to the
occurrence of said dictionary terms in the document corpus. 

14. (Previously presented) The signal-bearing medium of claim 25, wherein said method 
further comprises: 

developing a third uninterrupted listing for said entire document corpus, containing a
sequential listing of floating point multipliers, each said floating point multiplier representing 
a document normalization factor for a corresponding document in said document corpus. 

15. (Original claim) A data converter for organizing and representing in a computer 
memory a document corpus containing an ordered plurality of documents, for use by a data
mining applications program requiring occurrence-of-terms data, said representation to be
based on terms in a dictionary previously developed for said document corpus and wherein
each said term in said dictionary has associated therewith a corresponding unique integer,
said data converter comprising:

means for developing a first uninterrupted listing of said unique integers to
correspond to the occurrence of said dictionary terms in the document corpus; and

means for developing a second uninterrupted listing for said entire document corpus 
containing in sequence the location of each corresponding document in said first 
uninterrupted listing, wherein said first listing and said second listing are provided as input 
data for said data mining applications program. 
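
By way of illustration only, and without limiting the claims, the dictionary, the first uninterrupted listing, and the second uninterrupted listing might be laid out as in the following Python sketch. The function names, the whitespace tokenization, and the sample corpus are assumptions made for this example; the claims do not prescribe them.

```python
def build_dictionary(documents):
    """Assign a unique integer to each distinct term, in order of first appearance."""
    dictionary = {}
    for doc in documents:
        for term in doc.split():  # assumption: terms are whitespace-separated
            if term not in dictionary:
                dictionary[term] = len(dictionary)
    return dictionary


def build_listings(documents, dictionary):
    """Return (first_listing, second_listing) for an ordered corpus.

    first_listing: one flat list of term integers, document after document.
    second_listing: for each document, the index in first_listing where it starts.
    """
    first_listing = []
    second_listing = []
    for doc in documents:
        second_listing.append(len(first_listing))
        first_listing.extend(dictionary[term] for term in doc.split())
    return first_listing, second_listing


# Hypothetical two-document corpus.
corpus = ["cat dog cat", "dog bird"]
terms = build_dictionary(corpus)            # {'cat': 0, 'dog': 1, 'bird': 2}
first, second = build_listings(corpus, terms)
# first  == [0, 1, 0, 1, 2]
# second == [0, 3]
```

Under these assumptions, document k occupies `first[second[k]:second[k + 1]]`, with the final document running to the end of `first`, so the entire corpus is held in two flat arrays rather than one structure per document.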

16. (Original claim) The data converter of claim 15, further comprising:

means for developing a third uninterrupted listing for said entire document corpus,
containing a sequential listing of floating point multipliers, each said floating point multiplier
representing a document normalization factor for a corresponding document in said
document corpus. 

17. (Original claim) The data converter of claim 15, further comprising:

means for rearranging said unique integers so that any identical integers for each said 
document in said document corpus are adjacent. 
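
By way of illustration only, and without limiting the claims, the rearranging step might be carried out by sorting each document's segment of the first listing, as in the following Python sketch. Sorting is just one way to make identical integers adjacent; the function name `rearrange` and the sample data are assumptions made for this example.

```python
def rearrange(first_listing, second_listing):
    """Sort each document's segment so that identical integers become adjacent.

    first_listing: flat list of term integers for the whole corpus.
    second_listing: start index of each document within first_listing.
    """
    # Append the end of the corpus so that every document has an end boundary.
    boundaries = list(second_listing) + [len(first_listing)]
    rearranged = []
    for start, end in zip(boundaries, boundaries[1:]):
        rearranged.extend(sorted(first_listing[start:end]))
    return rearranged


# Hypothetical listings: document 0 is [2, 0, 2], document 1 is [1, 0, 1].
result = rearrange([2, 0, 2, 1, 0, 1], [0, 3])
# result == [0, 2, 2, 0, 1, 1]
```

Because each segment keeps its length, the second listing remains valid after rearranging, and the occurrence count of a term within a document can then be read as the length of a run of equal integers.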

18. (Previously presented) The method of claim 1, further comprising:

developing a dictionary comprising said terms contained in said document corpus;
and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers being said integers comprising said
first vector.

19. (Previously presented) The method of claim 1, further comprising:

developing a second vector for said entire document corpus, said second vector
indicating the location of each said document's representation in said first vector.

20. (Previously presented) The method of claim 5, further comprising:

developing a dictionary comprising terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers used in said first uninterrupted
listing.

21. (Previously presented) The method of claim 5, further comprising:

developing a second uninterrupted listing for said entire document corpus, said 
second uninterrupted listing containing, in sequence, the location of each corresponding 
document in said first uninterrupted listing. 

22. (Previously presented) The apparatus of claim 9, further comprising:

a dictionary developing module to develop a dictionary of terms contained in said
document corpus, each said term being associated with a corresponding unique integer.

23. (Previously presented) The apparatus of claim 9, further comprising:

a locator module developing a second uninterrupted listing for said entire document
corpus, said second uninterrupted listing containing, in sequence, the location of each
corresponding document in said first uninterrupted listing.

24. (Previously presented) The signal-bearing medium of claim 13, wherein said method 
further comprises: 

developing a dictionary comprising terms contained in said document corpus; and

associating, with each said dictionary term, an integer to be uniquely corresponding to
said dictionary term, said uniquely corresponding integers used in said first uninterrupted
listing.

25. (Previously presented) The signal-bearing medium of claim 13, wherein said method 
further comprises: 

developing a second uninterrupted listing for said entire document corpus, said
second uninterrupted listing containing, in sequence, the location of each corresponding 
document in said first uninterrupted listing. 